A Taxonomy Framework for Unsupervised Outlier Detection Techniques for Multi-Type Data Sets
نویسندگان
چکیده
The term “outlier” can generally be defined as an observation that is significantly different from the other values in a data set. The outliers may be instances of error or indicate events. The task of outlier detection aims at identifying such outliers in order to improve the analysis of data and further discover interesting and useful knowledge about unusual events within numerous applications domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees to select the most suitable technique based on data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.
منابع مشابه
Combining of Magnitude and Direction of Change Indices to Unsupervised Change Detection in Multitemporal Multispectral Remote Sensing Images
In remote sensing, image-based change detection techniques, analyze two images acquired over the same area at different times t1 and t2 to identify the changes occurred on the Earth's surface. Change detection approaches are mainly categorized as supervised and unsupervised. Generating the change index is a key step for change detection in multi-temporal remote sensing images. Unsupervised chan...
متن کاملOutlier Detection in WSN- A Survey
In the field of wireless sensor networks, the measurements that deviate from the normal behaviour of sensed data are taken to be as outliers. The potential sources of outliers can be noise and errors, events, and malicious attacks on the network. This paper give an overview of existing outlier detection techniques specifically developed for the wireless sensor networks. Also, a technique-based ...
متن کاملA Multi-Stage Intrusion Detection Approach for Network Security
Nowadays, the massive increment in applications running on a computer and excessive in network services forces to take convenient security policies into an account. Many methods of intrusion detection proposed to provide security in a computer system and network using data mining methods. These methods comprise of the outlier, unsupervised and supervised methods. As we know, each data mining me...
متن کاملPenalized unsupervised learning with outliers.
We consider the problem of performing unsupervised learning in the presence of outliers - that is, observations that do not come from the same distribution as the rest of the data. It is known that in this setting, standard approaches for unsupervised learning can yield unsatisfactory results. For instance, in the presence of severe outliers, K-means clustering will often assign each outlier to...
متن کاملHistogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm
Unsupervised anomaly detection is the process of nding outliers in data sets without prior training. In this paper, a histogrambased outlier detection (HBOS) algorithm is presented, which scores records in linear time. It assumes independence of the features making it much faster than multivariate approaches at the cost of less precision. A comparative evaluation on three UCI data sets and 10 s...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007